Extracting semistructured data from the Web: An XQuery Based Approach
Abstract
This paper describes work in progress on extracting information from the web. The work is part of a framework for extracting, interconnecting, and accessing heterogeneous data sources. We present a new approach to web information extraction in which the web is viewed as a large database of XML documents, and the XQuery language is used to extract information from that database. An experimental tool has been developed to validate this proposal.
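To make the idea of querying web content as XML concrete, here is a minimal sketch in Python. It is not the experimental tool mentioned in the abstract, and it uses the standard library's `xml.etree.ElementTree` rather than an XQuery engine. The sample catalog, the element names, and the function `titles_cheaper_than` are all hypothetical, chosen only to mirror the shape of a simple XQuery FLWOR expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document standing in for a web page already converted to XML.
SAMPLE_XML = """
<catalog>
  <book><title>XML Basics</title><price>25.00</price></book>
  <book><title>Advanced Queries</title><price>45.50</price></book>
  <book><title>Web Data</title><price>19.99</price></book>
</catalog>
"""


def titles_cheaper_than(xml_text, max_price):
    """Return the titles of all <book> elements priced below max_price.

    Roughly mirrors the XQuery expression (assumed, for illustration):
        for $b in //book where $b/price < 30 return $b/title
    """
    root = ET.fromstring(xml_text.strip())
    return [
        book.findtext("title")                      # "return" clause
        for book in root.iter("book")               # "for" clause
        if float(book.findtext("price")) < max_price  # "where" clause
    ]


print(titles_cheaper_than(SAMPLE_XML, 30.0))
```

The list comprehension plays the role of the `for`/`where`/`return` clauses; a real XQuery processor would additionally handle typing, ordering, and construction of new XML output.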
Similar resources
Mining Association Rules from XML Data using XQuery
In recent years XML has become very popular for representing semistructured data and as a standard for data exchange over the web. Mining XML data from the web is becoming increasingly important. Several encouraging attempts at developing methods for mining XML data have been proposed. However, efficiency and simplicity are still a barrier for further development. Normally, pre-processing or post-...
High Fuzzy Utility Based Frequent Patterns Mining Approach for Mobile Web Services Sequences
Nowadays, high fuzzy utility based pattern mining is an emerging topic in data mining. It refers to discovering all patterns whose utility meets a user-specified minimum high utility threshold. It comprises extracting patterns that are highly accessed in mobile web service sequences. Different from the traditional fuzzy approach, high fuzzy utility mining considers not only counts of mob...
The XML Query Language Xcerpt: Design Principles, Examples, and Semantics
Most query and transformation languages developed since the mid-90s for XML and semistructured data – e.g. XQuery [1], the precursors of XQuery [2], and XSLT [3] – build upon a path-oriented node selection: a node in a data item is specified in terms of a root-to-node path in the manner of the file selection languages of operating systems. Constructs inspired from the regular expression constr...
Query Algebra for Semistuctured Data
With the tremendous growth of World Wide Web (WWW) data, there is an emerging need for effective information retrieval at the document level. Several query languages such as XML-QL, XPath, XQL, Quilt and XQuery have been proposed in recent years to provide a faster way of querying XML data, but they still lack generality and efficiency. Our approach towards evolving a framework for querying semistru...
Xcerpt and visXcerpt: From Pattern-Based to Visual Querying of XML and Semistructured Data
With the advent of XML as a format for data exchange and semistructured databases, query languages for XML and semistructured data have become increasingly popular. Many such query languages, like XPath and XQuery, are navigational in the sense that their variable binding paradigm requires the programmer to specify path navigations through the document (or data item). In contrast, some other la...
Publication date: 2001